Skip to content

Fix GitHub API endpoint selection based on host parsing#1850

Open
bc-lee wants to merge 4 commits intomicrosoft:mainfrom
bc-lee:fix-github-api-url
Open

Fix GitHub API endpoint selection based on host parsing#1850
bc-lee wants to merge 4 commits intomicrosoft:mainfrom
bc-lee:fix-github-api-url

Conversation

@bc-lee
Copy link

@bc-lee bc-lee commented Nov 17, 2025

Previously, when GITHUB_SERVER_URL was set, vcpkg always treated it as a GitHub Enterprise Server(GHES) endpoint and appended /api/v3 for GITHUB_SERVER_URL=https://github.com, this produced an incorrect endpoint (https://github.com/api/v3/...).

This change special-cases the exact value https://github.com and uses https://api.github.com for dependency graph submission. All other server URLs keep the existing behavior (<server>/api/v3/...).

Added tests for endpoint selection behavior in src/vcpkg-test/downloads.cpp.

Fixes microsoft/vcpkg#48347

}

// Extracts the host part from a URL string.
std::string extract_host(StringView url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thing needs unit tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URL parsing is an insanely complicated problem; a parsing function that just returns whether we get github.com or not may be easier. Otherwise we may want to look at using trurl.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. I’ll revisit this once the curl library integration has been re-landed.

Comment on lines +54 to +55
const char* at_sign = Strings::find_first_of(url, "@");
size_t at_pos = at_sign[0] == '\0' ? std::string::npos : static_cast<size_t>(at_sign - url.data());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is undefined behavior: you can't dereference the returned iterator if there was no match. And there is no requirement that there is a \0 after a StringView

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto several times below

}

// Remove userinfo if present (e.g., user:pass@host)
const char* at_sign = Strings::find_first_of(url, "@");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems little reason to use find_first_of when there is only one char you're looking for?

@bc-lee bc-lee marked this pull request as draft November 25, 2025 02:37
@BillyONeal
Copy link
Member

Ah crap it looks like the URL APIs were added in 7.62.0 which is more recent than we currently assume. So we might be stuck using a handwritten thing like this or possibly copy/pasta-ing the relevant bits from libcurl's urlapi.c (with attribution)

@BillyONeal
Copy link
Member

(I guess that most systems GitHub cares about would have greater than 7.62 so that may be fine)

@bc-lee
Copy link
Author

bc-lee commented Feb 16, 2026

I reviewed your suggestion to use libcurl’s URL API for URL parsing.

  • curl’s urlapi.c is fairly complex and tightly coupled to other parts of curl. If the goal is to minimize drift from upstream, embedding that code in vcpkg-tool is still hard to read and maintain. If the goal is to minimize code size, we effectively end up rewriting URL parsing logic ourselves, which is risky given how subtle URL parsing can be.

While reading https://learn.microsoft.com/en-us/vcpkg/concepts/supported-hosts and https://github.com/microsoft/vcpkg-tool/blob/792f2dba1715f180412eef13b4de55f8085c82ca/cmake/FindLibCURL.cmake, I noticed that vcpkg-tool is currently too permissive about the minimum libcurl version it accepts. The libcurl URL API was introduced in 7.62.0 (released on 2018-10-31), and the lowest supported libcurl version on Linux is 7.74.0, which is already newer than 7.62.0. That means we should be able to use the libcurl URL API within vcpkg-tool’s supported host baseline.

If you’re OK with that direction, I’d like to split this work into two PRs:

  • Bump the minimum required libcurl version to 7.74.0 in FindLibCURL.cmake and update the documentation accordingly.
  • Use libcurl’s URL parsing functions in src/vcpkg/base/downloads.cpp instead of a custom implementation.

@BillyONeal
Copy link
Member

@ras0219-msft suggests (and @MahmoudGSaleh agrees): rather than doing the whole URI parsing, can we just check "if it's https://github.com" (just a string match) do the special github thing, otherwise do the existing behavior? That way we don't need to make the curl situation more complicated.

(And I'm sorry for not noticing this before when I first reviewed this 😅)

Previously, any provided GitHub URL was treated as a GitHub Enterprise
Server (GHES) instance when selecting the API endpoint. This assumption was
incorrect because the URL can also be for GitHub.com (e.g., https://github.com),
which led to using the wrong API endpoint.

This commit adds a function to extract the host from a URL. This is used to
differentiate between GitHub.com and GHES instances to select the correct
API endpoint.
@bc-lee bc-lee force-pushed the fix-github-api-url branch from 79462ea to da9910c Compare February 18, 2026 23:43
@bc-lee bc-lee marked this pull request as ready for review February 18, 2026 23:44
@bc-lee bc-lee requested a review from BillyONeal February 18, 2026 23:44
@bc-lee
Copy link
Author

bc-lee commented Feb 18, 2026

I simplified the logic significantly to align with your suggestion. Could you please take another look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dependency graph submission failed

2 participants